gs-olive
Collaborator

Description

Duplicate of PR #1488 targeted at Release 1.3
Fix compilation issues with Citrinet-1024 arising from a type-casting issue in aten::sum and a layer-naming issue in aten::div.

  • Enable automatic type-casting in aten::sum for bool tensor inputs to agree with Torch casting behavior
  • Fix bug in aten::div where all internal div layers have the same name
  • Add test cases for aten::sum type-casting
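
As a rough illustration of the type-casting change, the sketch below models the decision in plain Python rather than in the converter itself. `DType`, `reduce_input_dtype`, and `sum_bool_mask` are hypothetical names, and the choice of Int32 as the cast target is an assumption (the description only says that bool inputs are cast to agree with Torch, which sums a bool tensor as an integer count of its True elements).

```python
from enum import Enum


class DType(Enum):
    """Hypothetical stand-in for the tensor data types named in the logs."""
    FLOAT = "Float"
    HALF = "Half"
    INT8 = "Int8"
    INT32 = "Int32"
    BOOL = "Bool"


def reduce_input_dtype(dtype: DType) -> DType:
    """Return the dtype an input should have before a sum reduction."""
    if dtype is DType.BOOL:
        # Assumption: Int32 is the cast target; the PR text does not name it.
        return DType.INT32
    return dtype


def sum_bool_mask(mask: list[bool]) -> int:
    """Sum a bool mask the way Torch does: as a count of True entries."""
    return sum(int(value) for value in mask)


print(reduce_input_dtype(DType.BOOL))        # dtype used for the reduction
print(sum_bool_mask([True, False, True]))    # number of True entries
```

Non-bool inputs pass through unchanged in this sketch, so only the bool case is affected.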

Without type-casting, compilation fails with the following error:

DEBUG: [Torch-TensorRT - Debug Build] - ITensor shape: [1, 1, 256]
DEBUG: [Torch-TensorRT - Debug Build] - ITensor type: Bool
DEBUG: [Torch-TensorRT - Debug Build] - InDims 1 1 256
DEBUG: [Torch-TensorRT - Debug Build] - Dim to reduce(original):[-1]
DEBUG: [Torch-TensorRT - Debug Build] - Dim to reduce(converted):[2]
DEBUG: [Torch-TensorRT - Debug Build] - Axis Mask: 00000000000000000000000000000100
DEBUG: [Torch-TensorRT - Debug Build] - Keep dims: 1
WARNING: [Torch-TensorRT - Debug Build] - Sum converter disregards dtype
ERROR: [Torch-TensorRT TorchScript Conversion Context] - 3: %66 : Tensor = aten::sum(%mask2.2, %65, %26, %3779)
# ReduceLayer only supports Float, Half, Int8, and Int32 data types.
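
The dimension and axis-mask lines in the log above can be reproduced with a small sketch. This is an illustration in plain Python, not the converter's code; `normalize_dim` and `axis_mask` are hypothetical helper names. It assumes a negative dimension is wrapped by the tensor rank and that each reduced axis sets one bit of a 32-bit mask.

```python
def normalize_dim(dim: int, rank: int) -> int:
    """Map a possibly negative dimension index to a non-negative one."""
    if not -rank <= dim < rank:
        raise IndexError(f"dim {dim} is out of range for rank {rank}")
    return dim % rank


def axis_mask(dims: list[int], rank: int) -> int:
    """Build a bitmask with one bit set per axis to reduce."""
    mask = 0
    for dim in dims:
        mask |= 1 << normalize_dim(dim, rank)
    return mask


shape = [1, 1, 256]                      # ITensor shape from the log
converted = [normalize_dim(d, len(shape)) for d in [-1]]
mask = axis_mask([-1], len(shape))
mask_str = format(mask, "032b")          # 32-character binary string

print(converted)   # dims after conversion
print(mask_str)    # axis mask as a bit string
```

For the shape `[1, 1, 256]`, dimension `-1` maps to axis `2`, which sets the third-lowest bit of the mask.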

With type-casting, that error is resolved, but another appears:

ERROR: [Torch-TensorRT TorchScript Conversion Context] - 4: [network.cpp::validate::2761] Error Code 4: Internal Error (Repeated layer name: tmp_div (layers must have distinct names))
ERROR: [Torch-TensorRT TorchScript Conversion Context] - 2: [builder.cpp::buildSerializedNetwork::742] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
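
The repeated-name failure can be sketched as follows. This is a plain-Python model, not the TensorRT validator or the actual fix: `validate_layer_names` and `div_layer_name` are hypothetical, and deriving the layer name from the node name is one assumed way to make names distinct.

```python
def validate_layer_names(names: list[str]) -> None:
    """Reject a network in which two layers share a name."""
    seen = set()
    for name in names:
        if name in seen:
            raise ValueError(
                f"Repeated layer name: {name} (layers must have distinct names)"
            )
        seen.add(name)


def div_layer_name(node_name: str) -> str:
    """Hypothetical scheme: qualify the internal div layer by its node name."""
    return f"{node_name}_tmp_div"


# Two div nodes that both use a fixed internal name collide.
try:
    validate_layer_names(["tmp_div", "tmp_div"])
except ValueError as err:
    print(err)

# Qualifying each name by its node keeps the names distinct.
validate_layer_names([div_layer_name("%66"), div_layer_name("%70")])
```

Any scheme that yields a distinct name per node would pass the same check; the node-name prefix is only one option.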

With both updates, the model compiles successfully.

Fixes #1487

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • [ x ] My code follows the style guidelines of this project (You can use the linters)
  • [ x ] I have performed a self-review of my own code
  • [ x ] I have commented my code, particularly in hard-to-understand areas and hacks
  • [ x ] I have made corresponding changes to the documentation
  • [ x ] I have added tests to verify my fix or my feature
  • [ x ] New and existing unit tests pass locally with my changes
  • [ x ] I have added the relevant labels to my PR so that relevant reviewers are notified

@gs-olive gs-olive changed the title from "fix: Repair Citrinet-1024 compilation issues [Duplicate of [PR #1488](https://github.com/pytorch/TensorRT/pull/1488)]" to "fix: Repair Citrinet-1024 compilation issues [Duplicate of PR #1488]" Nov 28, 2022
@github-actions github-actions bot added component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: core Issues re: The core compiler component: tests Issues re: Tests labels Nov 28, 2022
@gs-olive gs-olive changed the title from "fix: Repair Citrinet-1024 compilation issues [Duplicate of PR #1488]" to "fix: Repair Citrinet-1024 compilation issues [Duplicate of PR #1488 for Release 1.3]" Nov 28, 2022
@gs-olive gs-olive added the release: v1.3 Tagged to be included in v1.3 label Nov 28, 2022
@gs-olive gs-olive self-assigned this Nov 28, 2022
@gs-olive gs-olive force-pushed the citrinet_bugfix_1.3 branch from bae9f38 to 6e8f9a3 Compare November 29, 2022 18:18
@gs-olive gs-olive force-pushed the citrinet_bugfix_1.3 branch 2 times, most recently from bae9f38 to 651f795 Compare November 30, 2022 02:56
@gs-olive gs-olive force-pushed the citrinet_bugfix_1.3 branch from 651f795 to 7a44a92 Compare November 30, 2022 16:48
Collaborator

@narendasan narendasan left a comment

Looks Good

@narendasan narendasan merged commit 8dc1a06 into pytorch:release/1.3 Nov 30, 2022
Labels
  • cla signed
  • component: api [Python] (Issues re: Python API)
  • component: build system (Issues re: Build system)
  • component: conversion (Issues re: Conversion stage)
  • component: converters (Issues re: Specific op converters)
  • component: core (Issues re: The core compiler)
  • component: fx
  • component: runtime
  • component: tests (Issues re: Tests)
  • release: v1.3 (Tagged to be included in v1.3)
3 participants